# WebLI Pretraining

Vit So400m Patch16 Siglip Gap 384.v2 Webli

A ViT image encoder based on SigLIP 2 that uses global average pooling in place of the attention pooling head, suitable for image feature extraction tasks.

Image Classification

Vit Giantopt Patch16 Siglip Gap 384.v2 Webli

A ViT image encoder based on SigLIP 2 that uses global average pooling in place of the attention pooling head, suitable for image feature extraction tasks.

Image Classification

Vit Base Patch32 Siglip Gap 256.v2 Webli

A Vision Transformer model based on SigLIP 2 that uses global average pooling (GAP) instead of an attention pooling head for image encoding.
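As a rough illustration of what global average pooling does in place of an attention pooling head, the sketch below averages a set of patch-token embeddings into a single image vector. The function name `global_average_pool` and the toy token values are hypothetical and are not taken from any of the listed models.

```python
from typing import List


def global_average_pool(patch_tokens: List[List[float]]) -> List[float]:
    """Average patch-token embeddings over the token axis.

    patch_tokens is a list of num_tokens vectors, each of length dim.
    The result is one vector of length dim.
    """
    if not patch_tokens:
        raise ValueError("expected at least one patch token")
    num_tokens = len(patch_tokens)
    dim = len(patch_tokens[0])
    return [sum(tok[d] for tok in patch_tokens) / num_tokens for d in range(dim)]


# Toy input: three patch tokens with two features each (illustrative values).
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
pooled = global_average_pool(tokens)
```

Unlike an attention pooling head, this reduction has no learned parameters: every patch token contributes equally to the pooled feature.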

Vit Gopt 16 SigLIP2 256

A SigLIP 2 vision-language model trained on the WebLI dataset, suitable for zero-shot image classification tasks.

Vit SO400M 14 SigLIP2

A SigLIP 2 vision-language model trained on the WebLI dataset, suitable for zero-shot image classification tasks.

Vit L 16 SigLIP2 384

A SigLIP 2 vision-language model trained on the WebLI dataset, suitable for zero-shot image classification tasks.

Vit B 16 SigLIP2

A SigLIP 2 vision-language model trained on the WebLI dataset, suitable for zero-shot image classification tasks.

Siglip2 So400m Patch16 256

SigLIP 2 is an improved vision-language encoder based on SigLIP that combines several additional training techniques to enhance semantic understanding, localization, and dense feature extraction capabilities.

Siglip2 Base Patch16 224

SigLIP 2 is an improved multilingual vision-language encoder based on SigLIP, enhancing semantic understanding, localization, and dense feature extraction capabilities.

Siglip So400m Patch16 256 I18n

A multimodal model built on the SoViT backbone and trained with the sigmoid loss function, supporting zero-shot image classification and image-text retrieval.
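For readers unfamiliar with the sigmoid loss these models are named after, here is a minimal sketch of a pairwise sigmoid image-text loss: each image-text pair in a batch is scored independently as a match or non-match. The function names, the temperature and bias values, and the toy embeddings are assumptions chosen for illustration, and the embeddings are assumed to be L2-normalized.

```python
import math
from typing import List


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def sigmoid_pairwise_loss(
    img: List[List[float]],
    txt: List[List[float]],
    temperature: float = 10.0,  # illustrative value, an assumption
    bias: float = -10.0,        # illustrative value, an assumption
) -> float:
    """Pairwise sigmoid loss over a batch of image and text embeddings.

    Pair (i, j) is labeled +1 when i == j (matching pair) and -1 otherwise.
    """
    n = len(img)
    total = 0.0
    for i in range(n):
        for j in range(n):
            similarity = sum(a * b for a, b in zip(img[i], txt[j]))
            logit = temperature * similarity + bias
            label = 1.0 if i == j else -1.0
            total += log_sigmoid(label * logit)
    return -total / n


# Toy batch of two unit-length embeddings (illustrative values).
images = [[1.0, 0.0], [0.0, 1.0]]
texts_matched = [[1.0, 0.0], [0.0, 1.0]]
texts_swapped = [[0.0, 1.0], [1.0, 0.0]]

loss_matched = sigmoid_pairwise_loss(images, texts_matched)
loss_swapped = sigmoid_pairwise_loss(images, texts_swapped)
```

Because each pair is treated as its own binary classification problem, the loss needs no softmax normalization across the batch; in this toy batch the correctly aligned texts yield a lower loss than the swapped ones.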

Vit SO400M 14 SigLIP 384

SigLIP (Sigmoid Loss for Language-Image Pretraining) model trained on the WebLI dataset, suitable for zero-shot image classification tasks.

Vit L 16 SigLIP 384

SigLIP (Sigmoid Loss for Language-Image Pretraining) model trained on the WebLI dataset, suitable for zero-shot image classification tasks.

Vit B 16 SigLIP 512

SigLIP (Sigmoid Loss for Language-Image Pretraining) model trained on the WebLI dataset, suitable for zero-shot image classification tasks.
